Achieving Superlinear Speedup on Hierarchical Distributed Memory Multiprocessors

Authors

  • Francisco F. Rivera
  • Oscar G. Plata
  • Emilio L. Zapata

Abstract

In this paper we present a massively parallel SPMD programming model that exploits data locality to obtain high execution speeds on real problems. Superlinear speedups are achieved when the mechanisms that manage the memory hierarchy operate efficiently. A simple model for predicting this behavior is introduced, and application examples in matrix algebra and pattern recognition algorithms on the KSR-1 system are presented.

Similar resources

Load Balancing for Extrapolation Methods on Distributed Memory Multiprocessors

We present a parallel algorithm for extrapolation methods on distributed memory multiprocessors combining different levels of parallelism. A detailed analysis that uses appropriate primitives for communication shows that a sophisticated load balancing scheme is required to achieve a good speedup. We characterize an optimal load balancing based on Lagrange multipliers and investigate several si...

Superlinear Speedup in Parallel Computation

Speedup of a parallel computation is defined as Sp = T/Tp [2], where T is the sequential time of a problem and Tp is the parallel time to solve the same problem using P processors. Sp was argued to be no greater than P in [3]. However, in practice, "superlinear speedup" has been observed, i.e. the speedup with P processors is greater than P. Two main reasons for superlinear speedup are shown in [...

Active Memory Techniques for ccNUMA Multiprocessors

Our recent work on uniprocessor and single-node multiprocessor (SMP) active memory systems uses address remapping techniques in conjunction with extended cache coherence protocols to improve access locality in processor caches. We extend our previous work in this paper and introduce the novel concept of multi-node active memory systems. We present the design of multi-node active memory cache co...

Mapping Of Backpropagation Learning Onto Distributed Memory Multiprocessors

This paper presents a mapping scheme for parallel pipelined execution of the Backpropagation Learning Algorithm on distributed memory multiprocessors (DMMs). The proposed implementation exhibits training set parallelism that involves batch updating. Simple algorithms have been presented, which allow the data transfer involved in both forward and backward execution phases of the backpropagati...

Diagonal Implicitly Iterated Runge Kutta Methods on Distributed Memory Multiprocessors

We investigate the parallel implementation of the diagonal implicitly iterated Runge Kutta (DIIRK) method, an iteration method based on a predictor-corrector scheme. This method is appropriate for the solution of stiff systems of ordinary differential equations (ODEs) and provides embedded formulae to control the stepsize. We discuss different strategies for the implementation of the DIIRK method on dist...

Publication date: 1994